ABSTRACT

Introduction: Online social networks represent a 
potential mechanism for the dissemination of health 
interventions including smoking cessation; however, 
which elements of an intervention determine diffusion 
between participants is unclear. Diffusion is frequently 
measured using R, the reproductive rate, which is 
determined by the duration of use (t), the 
'contagiousness' of an intervention (β) and a
participant's total contacts (Z). We have developed a
Facebook 'app' that allows us to enable or disable 
various components designed to impact the duration 
of use (expanded content, proactive contact), 
contagiousness (active and passive sharing) and 
number of contacts (use by non-smoker supporters). 
We hypothesised that these elements would be 
synergistic in their impact on R, while including 
non-smokers would induce a 'carrier' state allowing the 
app to bridge clusters of smokers. 
Methods and analysis: This study is a fractional 
factorial, randomised controlled trial of the diffusion of a
Facebook application for smoking cessation. 
Participants recruited through online advertising are 
randomised to 1 of 12 cells and serve as 'seed' users. 
All user interactions are tracked, including social 
interactions with friends. Individuals installing the 
application that can be traced back to a seed 
participant are deemed 'descendants' and form the 
outcome of interest. Analysis will be conducted using 
Poisson regression, with event count as the outcome 
and the number of seeds in the cell as the exposure. 
Results: The results will be reported as a baseline R0
for the reference group, and incidence rate ratio for the 
remainder of predictors. 

Ethics and Dissemination: This study uses an 
abbreviated consent process designed to minimise 
barriers to adoption and was deemed to be minimal 
risk by the Institutional Review Board (IRB). Results 
will be disseminated through traditional academic 
literature as well as social media. If feasible, 
anonymised data and underlying source code are 
intended to be made available under an open source 
license. 

ClinicalTrials.gov registration number: NCT01746472.



INTRODUCTION 

Smoking remains the leading cause of 443 000
preventable deaths and US$200 billion in excess
cost in the USA each year, making a large-scale
reduction in smoking prevalence a public
health imperative. Yet, evidence-based interventions
recommended by the Clinical Practice
Guideline for Tobacco Dependence Treatment
('2008 Guideline') do not reach the vast majority
of the 44 million current smokers in the
USA. A major paradigm shift in how cessation
interventions are developed is needed, targeting
large-scale dissemination and diffusion.

In theory, the broad reach and effectiveness
of evidence-based Internet cessation programmes
should yield enormous impact
(reach × efficacy) in reducing the population
prevalence of smoking. The majority (85%) of
US adults are Internet users, including populations
at disproportionate risk for smoking: 85%
of African Americans and 76% of those with
incomes less than US$30 000/year use the
Internet. Between 6% and 9% of all Internet
users (>10 million adults) search for information
on quitting smoking annually. Studies and multiple
meta-analyses show that Internet interventions
are effective, with a relative risk of abstinence
of 1.44 and quit rates of 7-26%.
Despite this promise, however, only one-third
of smokers searching the Internet actually
reach the limited number of websites that
provide cessation treatment consistent with the
2008 Guideline. Most existing Internet cessation
interventions that involve social support,
a key element of tobacco dependence treatment,
introduce participants into 'artificial'
networks in which individuals have no initial
connections and often create none. In such
networks, participation is limited by affiliation
with a particular behaviour (eg, quitting
smoking). As a result, potential network effects
on individual behaviour and the potential for
dissemination are sharply limited by high levels of
attrition, by the fact that most registrants never form
a single connection and by the fact that those
connections that are formed may be weak and transitory.

Online social networks may represent a more powerful 
dissemination channel for evidence-based tobacco 
dependence treatments. In contrast to the 'build it and 
they will come' model inherent in smoking-specific 
online cessation interventions, more general online 
social networks can be used to deliver proven cessation 
intervention elements to smokers 'where they are'. 
Two-thirds (67%) of the US adult Internet users use at 
least one social networking site such as Facebook or 
Twitter; importantly, nearly 80% of adults aged 18-49 do 
so, and 41% of those do so multiple times a day.
This increasing penetration of online social networks 
into the fabric of the typical American's life provides fas- 
cinating, if challenging, opportunities for intervention 
design. Interventions delivered in the context of an 
online social network can leverage the availability of an 
individual's self-identified social ties not only to optimise
support for cessation, but also for active and passive dis- 
tribution of the intervention through an individual's 
network to other smokers and beyond. 

The importance of social networks on smoking behav- 
iour and 'viral diffusion' has been seen in real-world net- 
works. Data from the Framingham Heart Study 
demonstrated that smokers tend to cluster within their 
social networks and are significantly less integrated, that 
these patterns persist over time, and that clusters of 
smokers tend to quit together. These findings suggest 
that interventions should target not just individual 
smokers but also their surrounding social network. This 
proposition is supported by robust evidence that social 
networks strongly affect social norms, and that within a 
network, norms may be altered by a single individual 
and perpetuated by other network members. In add- 
ition, non-smokers may serve as 'weak ties' spanning 
clusters of smokers, analogous to a disease carrier who 
transports the disease between remote villages. 
Recruiting non-smokers to a cessation intervention may 
augment available social support for cessation, and 
increase social pressure (complex contagion effects) on 
other smokers to participate, thereby facilitating viral 
spread.

This study aims to identify the variables that drive 
adoption and dissemination of an intervention through- 
out a network. The study examines this question within 
Facebook, the single largest and fastest growing online 
social network. As of 2013, approximately 143 million 
Americans use Facebook daily, with users globally 
sharing an average of 4.7 billion items of content daily. 
Facebook enables individuals to create a profile, identify
other members who are friends, exchange messages
through multiple channels and, most relevant to this
study, install small applications ('apps') created by
third parties. These apps rely on 'viral' diffusion to grow
their user base, and achieve this by inducing users to
'invite' their friends and enabling them to post information
about the app to their personal 'Timeline' (essentially
synonymous with the terms 'wall' or 'page') where
it can be seen by others (figure 1). Data from Facebook
suggest that individuals actively communicate with only
5% of their average 120 friends, but are passively
exposed to information about 2-2.5 times as many. By
exposing a smoker's entire social network to a stream of
'pushed' information in real-time about their cessation
progress (eg, 'Mary set a quit date'), it may be possible
to significantly enhance social support for cessation
(generating the response 'way to go Mary!' by network
ties) and facilitate active and passive diffusion of the
intervention.

Figure 1 Facebook Timeline.

The primary outcome metric of this study is the effi- 
ciency of this diffusion process, defined as the reproduct- 
ive rate (R). Online social networks depend on viral 
spread for dissemination of applications, a concept similar 
to snowball recruitment where investigators recruit only 
the 'seed' individual, and successive generations are 
recruited by the seed and their descendants. Within epi- 
demiology, R is quantified as the mean number of second- 
ary cases ('infections') that occur for a given 'infected' 
individual. For online interventions, we can quantify the
number of contacts of an individual and the duration that 
they are 'infectious' (ie, actively using an application). In 
this context, R can be expressed as: 

R = tβZ

t indicates the duration of being contagious (ie, the duration
an individual uses an application), β is a constant of
probability that determines the likelihood of spread from 
one individual to another for a given unit of time 
(referred to as 'contagiousness' hereafter), and Z is the 
number of contacts within the network. For an application 
with no diffusion, R will equal 0. For applications with R 
greater than 1, exponential growth will occur as each par- 
ticipant recruits/infects at least one other person; applica- 
tions with R<1 will require ongoing seeding to maintain 
population growth. Increasing the amount of time (t) that 
an application is used will increase the likelihood that it
spreads to new hosts. The goal of app developers is to
reach an epidemic threshold where R exceeds 1 (ie, the
app 'goes viral') and the application propagates autonomously,
thus no longer requiring expenditures to recruit
seed users. While exceeding an R value of 1 is highly
desirable, those cases where an epidemic threshold is not 
crossed can still serve as a multiplier for recruitment 
efforts. For example, 1000 individuals recruited to an inter- 
vention with R=0.2 will yield 1250 participants over five 
generations of viral diffusion. 
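
The arithmetic behind this example is a finite geometric series: each generation is R times the size of the one before it. The following is a minimal sketch of that calculation, assuming a constant R across generations; the function name and arguments are illustrative and not part of the study software.

```python
def expected_participants(seeds: int, r: float, generations: int) -> float:
    """Seeds plus `generations` rounds of descendants, assuming a constant R."""
    # Generation 0 is the seeds; generation g has seeds * r**g members.
    return sum(seeds * r**g for g in range(generations + 1))


# Worked example from the text: 1000 seeds, R = 0.2, five generations.
total = expected_participants(1000, 0.2, 5)

# For R < 1 the series converges to seeds / (1 - R) as generations increase.
limit = 1000 / (1 - 0.2)

print(round(total), round(limit))
```

Under these assumptions, almost all of the multiplier is realised within the first few generations when R is well below 1.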

Aims 

The primary aim of the study is to identify and character- 
ise the intervention characteristics that catalyse its diffu- 
sion through an online social network. We have tied our 
exploration to the concept of R and the three independent
variables that are its determinants: duration of
use (t), contagiousness (β) and number of contacts (Z).
We empirically constructed domains of online intervention
elements that we believed could impact each variable
with minimal overlap: information content and proactive
contact (t, duration of use), social communications (β,
contagiousness) and non-smoker integration (Z, number
of contacts). We hypothesise that intervention variants
containing greater information content, ongoing pro- 
active contact and active communication strategies will 
outperform control application variants and have higher 
Rs, and that their combination will be synergistic and will 
display positive interaction effects. We also hypothesise 
that the involvement of non-smokers as epidemiological 
'carriers' will allow the application to spread more effi- 
ciently by bridging clusters of smokers. 

The secondary aim is to identify and characterise the 
local (ego) networks of participants that affect diffusion
and quitting behaviour. The characteristics of interest 
include smoking status, nicotine dependence, age, 
gender and local network characteristics (number of 
friends, network density and social position) as well as 
their number of friends already using the application. 
We hypothesise that local social network characteristics 
will predict the adoption behaviour of friends after invi- 
tation (active communication) or exposure (passive 
communication) from the participant. In addition, we 
hypothesise that invitation, adoption, utilisation and 
early cessation behaviour will display a complex conta- 
gion pattern, where increasing levels of network penetra- 
tion (more friends that are pre-existing users of the 
application) will be associated with higher rates of diffu- 
sion, use of the application (duration and content 
exposure) and cessation behaviours (eg, setting of quit
dates).

METHODS/DESIGN
Study design 

This study involves two phases. Phase I was conducted 
from May 2012 to December 2012, and consisted of formative
research designed to develop, test and optimise
multiple features of a Facebook application titled
'UbiQUITous'. Each feature is hypothesised to have a 
differential effect on the R of the application. Phase II is 
an ongoing randomised controlled trial that uses a frac- 
tional factorial design to determine the primary compo- 
nents of an online intervention for smoking cessation 
that determine its diffusion through a social network. 
The trial in phase II uses the UbiQUITous app as the 
study environment. 

The generalised diffusion model guiding this study is 
presented in figure 2. Initial seed users are recruited (A) 
using purchased advertising within Facebook and earned 
media (unpaid publicity, such as a newspaper article or 
word of mouth). Data on resulting adoption (ie, app 
install) and utilisation (B and D, respectively) — including 
frequency and duration of use and content exposure — 
are automatically recorded for analysis. To evaluate diffu- 
sion (C and E), we use metrics of viral spread including 
the number of contacts ('friends') per user, the period of 
active use of the component ('infectious period') and 
the number of transmissions to determine the basic R of 
each studied component. Our software records all data in 
real-time to a relational database for later reconstruction 
of network maps and diffusion pathways. Detailed data 
collection methods are presented below. Based on the 
basic design patterns within Facebook, we divided β (the
metric of contagiousness) into two forms: β-passive, which
results from observed behaviour, and β-active, which
results from direct invitations or proactive, intentional 
contact from one user to another. 

Phase I: initial development and optimisation of the 
cessation application 

In phase I of this trial, Collins' MOST design method
guided our efforts to break the proposed intervention 
into individual features and prototype each feature prior 
to evaluating the full intervention in a large-scale rando- 
mised trial. We augmented our internal software devel- 
opment team with expertise from an external graphic 
design firm to develop the visual components of the 
intervention. We prototyped multiple features, settling 
on six that were technically feasible, yielded useful data 
and resonated with our pilot users (see table 1). These 
six features could be set at multiple levels, each targeting 
a single element of diffusion (t, β or Z). To keep the size
of the factorial model reasonable, we limited each 
feature to two levels (generally either on/off or high/ 
low). 

After development but prior to embarking on the full
randomised trial, we consumer tested and refined each
feature by testing a successive series of beta versions of the
app. While this phase enabled us to detect programming
and data recording errors, the primary focus was on
iteratively evaluating and optimising the performance of
each individual feature prior to the expense of a full
randomised trial. Facebook offers a free development
environment in which any third-party developer can
create apps. For each beta app, we launched the application
features within Facebook and used paid advertising to
recruit users. Based on user behaviour in the app and
qualitative feedback gathered via short surveys to users,
we made data-driven refinements in layout, presentation,
content and message schedule, and evaluated their
impact on the target metrics (R, t, β or Z). Following
refinement and optimisation, we proceeded to full
recruitment and randomisation using a factorial model
in phase II, described below.

Figure 2 Viral diffusion model.

Phase II: evaluation of diffusion in a large-scale 
randomised trial 

The six features developed in phase I were translated
into a factorial model. A full six-feature factorial would
result in 64 separate cells. We simplified the matrix by
combining features targeting the same variables to
create four separate factors: t (expanded content and
proactive contact), β-active (active diffusion: invites and
social comparison), β-passive (passive diffusion:
sharing) and Z (non-smoker supporters), resulting in a
16-cell factorial matrix (4 factors at 2 levels each, 2^4=16 cells). A
final simplification eliminated the four cells that had no
theoretical potential for diffusion, where both β-active
(invites) and β-passive (sharing) were disabled, resulting
in a fractional factorial model with 12 cells (table 2).
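
The cell counts in this paragraph can be checked by enumerating the design. The sketch below is illustrative only: it assumes each factor is coded 0 (off) or 1 (on), and the names `FACTORS`, `full_factorial` and `retained_cells` are hypothetical rather than taken from the trial software.

```python
from itertools import product

# The four factors described in the text; 0 = off/low, 1 = on/high (assumed coding).
FACTORS = ("t", "Z", "beta_active", "beta_passive")


def full_factorial():
    """All 2^4 = 16 combinations of the four two-level factors."""
    return [dict(zip(FACTORS, levels))
            for levels in product((0, 1), repeat=len(FACTORS))]


def retained_cells():
    """Drop cells with both diffusion channels disabled (no potential for spread)."""
    return [cell for cell in full_factorial()
            if cell["beta_active"] or cell["beta_passive"]]


cells = retained_cells()
print(len(full_factorial()), len(cells))
```

Removing the combinations with both β-active and β-passive off discards exactly one cell for each of the four (t, Z) combinations, which is where the reduction from 16 to 12 comes from.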

Setting and participants 

The randomised trial is conducted entirely within 
Facebook with all recruitment, screening, enrolment 
and randomisation automated by our clinical trials management
software. Participants are registered users of
Facebook, a free social networking website. To be eligible,
individuals must be current smokers, age 18 years
or older and have an existing Facebook account. While 
seed enrolment targets English-speaking US residents, 
there are no language or residency restrictions. A 10% 
subsample is randomly selected from the initial seeds for 
additional data collection and follow-up. 

Recruitment 

Initial adopters ('seeds') are recruited primarily via 
online advertisements within Facebook that feature the 
app name, an app-related image and a short snippet of 
text advertising our free quit smoking app. Individuals 
clicking through to the app are shown a Facebook dia- 
logue box asking for permission to install the applica- 
tion. Following app install, users provide informed 
consent for the study. Additional waves of participants 
are recruited via snowball methodology (ie, viral spread) 
as they are informed or invited to participate by friends 
within the network. Individuals who have a friend 
already using the application ('descendants') are 
enrolled in the study and represent the outcome of 
interest (ie, diffusion). 

Inclusion criteria for seed participants are: US residency,
current smoking, age 18 or older, an English-language
Facebook account and an email address, acceptance of
Facebook permissions for app installation and provision
of informed consent for the study. The only exclusion
criterion is having one or more Facebook friends who have
already installed the application. Age, existing friends who
are already app users and location-related eligibility are
assessed in real-time immediately upon installation.
Informed consent is required in order to proceed into the
app. Smoking status is assessed immediately after informed
consent. Ineligible users who provide informed consent may
still use the app, but are excluded from the study.

Table 1 Application feature matrix

Target variable: t

Application features:

Expanded content (quit guides)
► A single generic quit guide supplemented by 9
additional topic-specific quit guides
► 'Crave button' that when pushed randomly displays
a humorous video from a content library

Proactive contact (app requests)
► Facebook requests: small icon appears on
bookmark with flyover messages
► App notifications: user receives a notification when
friends achieve specific cessation milestones
► Email: direct email on installation to the participant
reminding them to come back

Levels:
On: has all quit guides, all crave content, has
proactive app notifications, and gets an email on
installation
Off: has one quit guide, limited crave content, no
proactive app notifications and no email on
installation

Target variable: β-passive (βp)

Application features:

Passive diffusion (sharing)
► Content (quit guides, crave content, check-ins,
badges, money/life saved) can be shared on a
user's Timeline
► Badges earned and quit status are automatically
posted to a user's Timeline

Levels:
On: content is sharable, Timeline posts are generated
Off: content not sharable, Timeline posts not
generated

Target variable: β-active (βa)

Application features:

Active diffusion (invites)
► During onboarding process, the user is prompted to
identify and invite friends to either support them or
quit with them
► A persistent clickable interface element is available
to invite friends
► Users can post to a friend's Timeline using the
'Cure' and 'Capture' buttons (even if the friend has
not installed)

Social comparison (leaderboard)
► Compares individuals to others on various metrics.
Tab 1 compares individuals based on points
earned in the app; Tab 2 presents cumulative
quitting metrics ($ saved, life saved) for the
participants' friends versus other participants' local
networks

Levels:
On: user can invite friends, post to friends' Timelines,
and has both tabs of the leaderboard
Off: user cannot invite friends, cannot post to friends'
Timelines through the app, and only has the
individual tab of leaderboard

Target variable: Z

Application features:

Version for non-smoker supporters
► Non-smokers can install the application
► Original quit guide tailored to non-smokers; access to
all other quit guides
► Daily check-ins providing content on how to help a
friend stay smoke-free
► Otherwise identical experience to smokers

Levels:
On: smokers can have non-smoker supporters use
the app
Off: non-smoker supporters cannot use app (can
install, but once declared non-smoker the app has no
functionality)

Subsample participants are randomly selected at a vari- 
able rate which is manually adjusted based on comple- 
tion rates of the subsample survey to yield a final 
proportion of 10% of seed users. Subsample participants 
are reimbursed US$20 per completed survey. 



Randomisation 

Seed users are randomised to 1 of 12 cells using an adaptive
'biased-coin' strategy which keeps the 12 cells in
relative balance over the course of the trial. The prob- 
ability of an individual being assigned to any given cell is 
adjusted in real-time by the clinical trials management 
system based on any pre-existing imbalance between the 
cells. 
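
One way such an adaptive allocation could work is sketched below. This is not the algorithm used by the trial's clinical trials management system, whose weighting rule is not described here; the rule shown (weight each cell by how far it lags the fullest cell, plus one) and the function names are assumptions chosen only to illustrate the idea of correcting imbalance in real time.

```python
import random
from collections import Counter


def assignment_weights(counts, n_cells=12):
    """Weight per cell: how far it lags the fullest cell, plus 1 (assumed rule).

    The +1 keeps every cell eligible, so allocation stays random rather than
    deterministic even when the cells are perfectly balanced.
    """
    fullest = max(counts.get(c, 0) for c in range(1, n_cells + 1))
    return {c: fullest - counts.get(c, 0) + 1 for c in range(1, n_cells + 1)}


def assign_cell(counts, rng, n_cells=12):
    """Draw one cell with probability proportional to its weight, then record it."""
    weights = assignment_weights(counts, n_cells)
    cells = list(weights)
    chosen = rng.choices(cells, weights=[weights[c] for c in cells], k=1)[0]
    counts[chosen] += 1
    return chosen


rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = Counter()
for _ in range(1200):
    assign_cell(counts, rng)

sizes = [counts[c] for c in range(1, 13)]
print(min(sizes), max(sizes))
```

Because under-filled cells are always more likely to be chosen, the gap between the largest and smallest cell stays small throughout enrolment rather than only at the end.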

Descendants are users who have one or more Facebook
friends who have already installed the application
('parent'). Descendants install the app and accept
informed consent in the same manner as seed users.
They are assigned to the same cell as their parent.
Descendants who have more than one friend in the
study and for whom the diffusion channel (eg, active
invite, Facebook ad) is unclear are assigned to the same
cell as the friend who installed the app most recently.
The designation of seed or descendant does not affect
the user's app experience; they are simply identified as
such in the relational database. Descendants may be
smokers or non-smokers.

Table 2 Cell manipulations

Cell  t  Z  β-Active  β-Passive  Diffusion (three-level)
1     0  0  0         0          X
2     0  0  0         1          0
3     0  0  1         0          1
4     0  0  1         1          2
5     0  1  0         0          X
6     0  1  0         1          0
7     0  1  1         0          1
8     0  1  1         1          2
9     1  0  0         0          X
10    1  0  0         1          0
11    1  0  1         0          1
12    1  0  1         1          2
13    1  1  0         0          X
14    1  1  0         1          0
15    1  1  1         0          1
16    1  1  1         1          2

Cells marked X (1, 5, 9 and 13) have β-active and β-passive disabled and are
suppressed to create a 12-cell fractional factorial model.

Non-smokers who do not have a friend in the app 
(ie, no parent seed) are assigned to the cell with all fea- 
tures enabled, but neither they nor their descendants 
are included in the study itself. 

Facebook provides information on how an individual 
located the application, and if any of their friends are 
already users. Our application tracks potential paths of 
diffusion by embedding tracking tags within all links. 
New users who reach the app through an existing seed 
are identified in real-time and excluded from becoming 
seeds themselves. 

Intervention 

The intervention is derived from the US Public Health 
Service (PHS) '5As' model (Ask, Advise, Assess, Assist and 
Arrange). Content is based largely on PHS cessation
materials for smokers supplemented by content written by 
the intervention team. Content is designed to motivate 
smokers to quit, provide support around a quit date, 
inform users of the benefits of quitting and build self- 
efficacy. On installation, users are greeted by the applica- 
tion's central character, Dr Youkwitz, who Asks participants 
if they smoke and Advises smokers to quit. He then Assesses 
their readiness to make a quit attempt and Assists them by 
providing a tool ('Quit Date Wizard') for planning a quit 
attempt and setting a quit date (see figure 3). Quit dates
are stored for analysis and are used for tailoring and targeting
in other intervention components. If a user sets a quit
date, the app displays a countdown to that date or an esti- 
mate of savings since that date (money saved, estimate of 
life saved). Users who do not set a quit date in their first 
visit may set one at any time. The application also Arranges 
follow-up in the form of daily check-ins with Dr Youkwitz 
who provides tailored and personalised information and
support, and gather self-reported smoking status. Users 
randomised to cells that have the variable t turned on 
receive proactive Facebook app requests alerting them that 
a check-in is ready for them in the app. Participants who 
set a quit date are prompted at each check-in to confirm 
their quit date or update their smoking status. Smokers 
who have not set a quit date receive a variety of daily check- 
ins that include prompts to set a quit date, as well as 
evidence-based content incorporating the '5 Rs' 
(Relevance, Risk, Rewards, Roadblocks and Repetition) 
derived from the PHS guidelines. Users can receive check- 
ins for a year after their quit date. 

The app employs simple game mechanics (points and 
badges) and a cartoon representation of Dr Youkwitz's 
lab, where the participant is exposed to smoking cessa- 
tion information and tools (see figure 4). The longer a 
user stays engaged with the app, the more he/she is 
exposed to an unfolding narrative: Dr Youkwitz's experi- 
ments with a new anticraving drug have gone awry and 
have turned the user's friends into 'craving zombies'. 
Users can earn doses of a 'cure' by using various features 
of the app or by bringing their friends to the lab 
(ie, inviting to the app) to be cured and to provide 
support. This integration of game mechanics was 
designed to mirror the existing applications on 
Facebook, such as Farmville or Words With Friends. 

This study utilises a factorial design to test the effects 
of multiple components of the cessation application 
(see table 1). 

t (Duration of exposure) features 

Duration of exposure is maximised through expanded 
content and proactive contact. The app provides informa- 
tion in the form of a general quit guide or multiple topic- 
specific quit guides (eg, cessation and weight loss/main- 
tenance, cessation and stress management) as well as 
a library of short YouTube videos and animated gifs that can
be accessed by pushing a 'Crave Button'. We hypothesise 
that the availability of smoking cessation informational 
content will be a strong driver of ongoing utilisation (thus 
increasing t). Proactive contact is implemented by encouraging
users to come back to the app through a reminder
that appears within the Facebook interface whether the 
user is using the application or not. These reminders also 
appear when a friend has installed the app, or when an 
installed friend hits a quit milestone. 

β (Contagiousness) features

Individuals in online networks are exposed to personal
information of approximately twice as many individuals
as they nominate as buddies or with whom they actively
communicate. The app leverages this network phenomenon
by manipulating sharing and competition-driven
app use to drive contagion, both of which map
well to social support mechanisms that rely on information
transfer and normative influence.

Figure 3 Application Quit Date Wizard.

β-passive allows users to share app content (eg, a quit
date they have set, a badge they have earned or content
from a quit guide) on their Timeline for friends to see. The
app automatically posts on the participant's behalf when
quit milestones are achieved (eg, setting a quit date,
staying quit for consecutive days, reaching 1-month
smoke-free). Each post to a participant's Timeline,
either by themselves or by the app, generates opportunities
for their friends to actively engage with the user's
quit attempt by liking, commenting on or sharing the
application-generated object, or by clicking on the shared
content. Individuals who have not yet installed the application
and who click on app content are taken to a page
with further information (eg, health benefits the user
attained by reaching 1-month smoke-free) and encouraged
to install the app to support their friend.

Figure 4 Application main screen.

β-active allows participants to invite members of their
Facebook network to install the app. The app 
encourages participants to invite others for cessation 
support and also to achieve game-based rewards. 
Participants may also share content from the application 
directly to a Facebook friend's Timeline or Wall. 
Network-level data are also displayed so that participants 
are exposed to goal-driven and normative information 
that compares them with others and to prespecified 
metrics (eg, number of friends with application 
installed, individual 'game points' earned via engage- 
ment with the application and hitting cessation-based 
milestones and collective life saved by the participant 
and their installed friends). The information and pres- 
entation are designed to encourage individuals to 
actively recruit others to participate. 

Z (number of contacts) feature 

In order for an intervention targeted at smokers to 
spread with maximum efficiency from cluster to cluster
(bridging), it needs to induce a 'carrier' state in non-smokers.
A version of the app for non-smoker supporters
allows non-smokers to provide support and has content 
tailored for non-smokers. Seed users are randomised to 
a version that can be shared with non-smoking friends 
or a version that is restricted to sharing with other 
smokers. 

Data collection and measures 

The majority of data collection occurs through an appli- 
cation programming interface (API) provided by 
Facebook. The API allows our systems to interact directly 
with Facebook's database to retrieve data about individ- 
ual users and their immediate social network. Since this 
study is a test of diffusion, we deliberately chose not to 
insert additional questions into the standard application 
installation process. Each participant is identified with a 
unique numeric identifier provided by Facebook. 

To supplement the limited demographic data available 
through Facebook, we subsample 10% of seed users to 
further characterise study participants and to provide an 
estimate of intervention effectiveness. Measures collected
from all seed (and descendant) users are listed
below. Measures collected only from the subsample are
indicated as such. 

Facebook data 

Data available from Facebook include email address, 
date of birth, gender, location, hometown, photos, likes, 
groups and a list of friends (including friends' birthdate, 
gender, location, likes, relationship to user and photos). 
Location and hometown information is optional within 
Facebook and not always available. Connections between 
a participant's friends are gathered automatically when 
available. Photos, groups, likes and location are used to 
construct and weight a social network graph using mul- 
tiple co-occurrences as evidence of a stronger tie. 
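
The weighting idea above can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes each shared photo, group, like or location counts as one co-occurrence of equal weight, and the function name, item labels and user ids are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_weighted_ties(shared_items):
    """Count co-occurrences between pairs of users.

    shared_items maps an item label (a photo, group, like or location)
    to the set of user ids associated with it. Labels and ids are
    hypothetical placeholders.
    """
    weights = Counter()
    for users in shared_items.values():
        # Each pair of users attached to the same item gains one co-occurrence.
        for a, b in combinations(sorted(users), 2):
            weights[(a, b)] += 1
    return weights


example = {
    "photo:1": {"u1", "u2"},
    "group:quitters": {"u1", "u2", "u3"},
    "like:running": {"u2", "u3"},
}
ties = build_weighted_ties(example)
```

A higher count in `ties` is read as evidence of a stronger tie. Whether different item types should carry different weights is not stated in the protocol.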

Automated process tracking measures

A Facebook member may choose to install an applica- 
tion ('become infected') based on advertising or other 
earned media, observation of others' behaviour or 
direct invitation. Data on daily advertising expenditures, 
exposures and subsequent click-throughs are recorded 
automatically into the relational database. Standardised
mechanisms within the Facebook API are used to record
'invitations', Timeline posts and subsequent 'acceptance'
or click-throughs by individuals, allowing a clear chain of
diffusion to be established back to an initial adopter and
a precise calculation of R at each degree of separation.
For application adoptions not specifically mapped to an
individual user (eg, where an individual is exposed to
information about multiple friends using the intervention
but is not specifically invited), we record the friend who
installed most recently as a separate 'guessed' parent.
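
The chain-of-diffusion bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: install times are plain numbers, a seed is any user with no recorded parent, and all function and variable names are hypothetical.

```python
from collections import Counter


def guess_parent(install_time, friends, installed_at):
    """Return the friend who installed most recently before install_time.

    installed_at maps user id -> install time for users who have the app.
    Returns None when no friend installed earlier.
    """
    earlier = [f for f in friends
               if f in installed_at and installed_at[f] < install_time]
    return max(earlier, key=lambda f: installed_at[f]) if earlier else None


def generation(user, parent_of):
    """Count the steps from user back to a seed (a user with no parent)."""
    depth = 0
    while parent_of.get(user) is not None:
        user = parent_of[user]
        depth += 1
    return depth


def r_by_degree(parent_of):
    """R at degree g: installs at generation g+1 per user at generation g."""
    sizes = Counter(generation(u, parent_of) for u in parent_of)
    return {g: sizes.get(g + 1, 0) / n for g, n in sorted(sizes.items())}


# Hypothetical chain: two seeds, two first-degree and one second-degree install.
parent_of = {"s1": None, "s2": None, "a": "s1", "b": "s1", "c": "a"}
rates = r_by_degree(parent_of)
```

The value for the deepest generation is zero by construction and should be read as censored rather than as a true rate. Because a parent always installs before its descendant, the walk back to a seed terminates.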

While abstinence is not a primary outcome metric in 
this study, information on quit dates is used as a marker 
of smoking status. This information will be used during 
analysis to extrapolate smoking status at arbitrary time 
points to reconstruct social network structure. In our
earlier work, we have found that a quit date occurring in
the past is a useful proxy for smoking status in
descriptive analyses.
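
As a rough sketch of this proxy, the rule below classifies a participant at an arbitrary date from a recorded quit date. It assumes that a quit date on or before the date of interest means the participant has quit, and it does not model relapse. The function name and status labels are hypothetical.

```python
from datetime import date


def smoking_status(quit_date, as_of):
    """Classify a participant on as_of using only a recorded quit date.

    Assumption: a quit date on or before as_of is treated as 'quit';
    a later quit date is treated as 'smoking'. Relapse is ignored.
    """
    if quit_date is None:
        return "no quit date"
    return "quit" if quit_date <= as_of else "smoking"


status_before = smoking_status(date(2013, 3, 1), as_of=date(2013, 2, 1))
status_after = smoking_status(date(2013, 3, 1), as_of=date(2013, 4, 1))
```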

For all participants, we record application installation,
each return visit, specific pages viewed, total duration of
the visit and use of app tools (eg, Quit Date Wizard,
daily check-ins). Additionally, we record application
uninstallation or blocking and 'like' and 'dislike' tags.
All application errors (eg, failure to post to a feed) are
recorded to a standardised error log, as is downtime of
the application and of Facebook itself. These error data
are used in real time to adjust performance and, if
needed, will be controlled for in final analyses.

Baseline and follow-up self-report data (subsample only) 

Users selected for the subsample are presented with a 
survey when they indicate smoking status in the app. 
The survey is presented within the Facebook frame, and 
consists of demographic, smoking status and nicotine 
dependence, social support and network size questions. 
Subsampled participants are contacted via email at
30 days to take a web-based follow-up survey using a
subset of baseline measures; a relatively short time frame
was selected to maximise response rates, which were
expected to be low. Non-responders receive a reminder
email at 33 days, and are contacted directly by research
staff through a private Facebook message at 34 days.

Smoking status and nicotine dependence 

Self-identified smoking status is assessed among all parti- 
cipants at enrolment. Subsample participants also report 
readiness to quit^^ and nicotine dependence is mea- 
sured with the 'time to first cigarette' item.^^ 

Social support for cessation 

Subsample participants complete an adapted version of
the Partner Interaction Questionnaire (PIQ).^^ This
measure assesses receipt of specific positive and negative
behaviours from the individual who has followed the
participant's efforts to quit smoking most closely.

Network size

Subsample participants complete a series of questions 
about "how many people named [first name] do you 
know."^^ The names selected satisfy a 'scaled-down condi- 
tion' such that, for example, if 15% of the population is 
men between age 21 and 40, then 15% of the people 
asked about must also be men between age 21 and 40. We
implemented the names inventory as in McCormick et al^^
and the Pew Internet Survey in 2011.^^
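
For orientation, responses to a names inventory are commonly converted to a network size with the basic scale-up estimator: the number of people known across the listed names, divided by the number of people in the population bearing those names, multiplied by the population size. The protocol does not state which estimator is used, so the sketch below is an assumption, and every count in it is invented for illustration.

```python
def scale_up_network_size(known_counts, name_population, total_population):
    """Basic scale-up estimate of personal network size.

    known_counts: name -> how many people with that name the respondent knows.
    name_population: name -> how many people in the population have that name.
    total_population: size of the population the names are drawn from.
    """
    total_known = sum(known_counts.get(name, 0) for name in name_population)
    total_named = sum(name_population.values())
    return total_population * total_known / total_named


# Hypothetical figures, not taken from any census or from the study.
estimate = scale_up_network_size(
    known_counts={"Michael": 2, "Christina": 1},
    name_population={"Michael": 4_000_000, "Christina": 1_000_000},
    total_population=300_000_000,
)
```

In this invented example the respondent knows 3 of 5 million named people, which scales to an estimated network of 180 people.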

Power analysis 

We have compensated for the difficulty of estimating
sample size in this field by leveraging our capacity for
scalable, low-cost recruitment. As an application becomes
more accepted and valued by a group, others are more
likely to value it.^^ Since this process is unpredictable at
the individual level, the end exponential effects are
highly unpredictable. The common-sense approach to this,
proposed by Watts and Dodds,^^ is 'the big seed model', and
involves seeding the application to as many initial
individuals as possible, rather than to a carefully targeted
few. In a small network this can be an issue, as initial
seeds may know each other, thus contaminating the diffusion
metrics; however, in large social utilities with tens to
hundreds of millions of users this is statistically less
likely.

Examples of basic R for viral marketing campaigns in
other online modalities have ranged from 0.041 to 2.^^
The study is powered for a sub-1.0 R at only the first
degree (R0), to ensure productive analysis and results
even if the intervention does not reach the 'viral
threshold'. Using data from prior studies in the business
and social marketing literature, we performed sample size
calculations for individual cell comparisons, with
estimated R ranging from 0.1 to 0.5. We calculated that a
study size of N=8000 would provide 88% power to examine
all between-factor analyses with a minimal detectable
difference in basic R of 0.1. Increasing the sample size
to 12 560 yields the ability to examine the interaction
effects at the same detectable difference and a power of
80%.
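
To give a feel for how power depends on the number of seeds per cell, the sketch below uses a normal approximation for comparing two Poisson rates. It is not the calculation used for this protocol and is not expected to reproduce the figures above: it assumes a simple two-cell comparison with equal numbers of seeds, a two-sided test, and hypothetical rates and cell sizes.

```python
import math
from statistics import NormalDist


def power_two_rates(rate_ref, rate_alt, seeds_per_cell, alpha=0.05):
    """Approximate power to detect rate_alt versus rate_ref.

    Each cell's descendant count is treated as Poisson, so the variance
    of an estimated rate is rate / seeds_per_cell. Only the upper
    rejection region of the two-sided test is counted.
    """
    se = math.sqrt((rate_ref + rate_alt) / seeds_per_cell)
    z_effect = abs(rate_alt - rate_ref) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z_effect - z_crit)


# Hypothetical cell sizes for a difference in R of 0.1 (0.1 versus 0.2).
power_small = power_two_rates(0.1, 0.2, seeds_per_cell=200)
power_large = power_two_rates(0.1, 0.2, seeds_per_cell=800)
```

Power rises with the number of seeds per cell, which is why low-cost, scalable recruitment is useful when R is expected to be small.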

Statistical analyses 

Outcome data will be obtained in the form of counts of
new cases arising from direct contact with primary seed
subjects, with exposure defined as the count of total seed
subjects in each treatment cell. The design is a fractional
factorial; no cell omits both passive and active
diffusion (see table 2 for manipulations). The diffusion
manipulations will be collapsed into a single three-level
categorical treatment. Analysis will be conducted using
Poisson regression, fitted as a Generalised Linear Model
with a log link and Poisson family. Design variables will be
entered as predictors representing the treatment
combinations, with the reference category representing
minimal content, no non-smoker support and only passive
diffusion. Results will be reported as baseline R for the
reference group and incidence rate ratios for all other
entries, and hypotheses will be tested at α=0.05. We will
test interactions for entry into the model, and omit the
interactions if they are not significant. Post hoc
comparisons may be made after fitting the regression model
using the Wald test.
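
For a single categorical predictor, the Poisson model with seeds as exposure has a closed form: the fitted rate in each cell is descendants divided by seeds, and the incidence rate ratio is the ratio of two such rates. The sketch below computes that ratio with a Wald test and confidence interval. It is an illustration only: the counts are hypothetical, and the full analysis with several factors and interactions would need a fitted regression model rather than this shortcut.

```python
import math
from statistics import NormalDist


def incidence_rate_ratio(desc_ref, seeds_ref, desc_trt, seeds_trt, alpha=0.05):
    """Return (IRR, confidence interval, Wald p value) for one cell vs reference.

    desc_* are descendant counts; seeds_* are seed counts (the exposure).
    """
    irr = (desc_trt / seeds_trt) / (desc_ref / seeds_ref)
    # Standard error of log(IRR) for two independent Poisson counts.
    se_log = math.sqrt(1 / desc_trt + 1 / desc_ref)
    z = math.log(irr) / se_log
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (math.exp(math.log(irr) - z_crit * se_log),
          math.exp(math.log(irr) + z_crit * se_log))
    return irr, ci, p_value


# Hypothetical cells: reference R = 100/1000, comparison R = 150/1000.
baseline_r = 100 / 1000
irr, ci, p_value = incidence_rate_ratio(100, 1000, 150, 1000)
```

Here `baseline_r` plays the role of the baseline R reported for the reference group, and `irr` is the multiplicative change in R for the comparison cell.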

ETHICS AND DISSEMINATION 
Informed consent 

We use a two-step consent process: the participant first 
provides consent to Facebook for the release of their data 
via a dialogue box within Facebook itself (see figure 5), 
and then provides informed consent to the study. This 
study was deemed eligible for abbreviated consent as per 
Federal regulations. 

Dissemination 

In theory, study data can be anonymised and made suit- 
able for data sharing; however, this has proved highly 
challenging in practice and risks of disclosure remain in 
many datasets.^^ If anonymity of participants can be 
assured, we intend to make data available to other inves- 
tigators through either a National Institutes of Health 
(NIH) mechanism (CaBIG) or a non-profit academic 
mechanism such as the Dataverse project. We intend to 
repackage our source code as an open source platform 
for performing research within Facebook, and welcome 
potential collaborators. 

The study results will be disseminated through confer- 
ence presentations and peer-reviewed manuscripts. 

Figure 5 Facebook data transfer consent screen. The
dialogue reads: 'Provide UbiQUITous your public profile,
friend list, email address, relationships, birthday, groups,
hometown, current city, photos, likes and your friends'
relationships, birthdays, groups, current cities, photos
and likes.' The dialogue also shows App Terms and Privacy
Policy links and a Cancel button.

Initial results of advertising and recruitment methods
have been presented in abstract form at academic con- 
ferences, while implementation and programming 
methods have been presented by the development team 
at engineering conferences. The main outcomes of the 
trial, planned social network analyses and secondary 
data exploration will be presented at future conferences 
and published in the peer-reviewed literature. Given the 
topic of this project, however, we are equally interested 
in novel forms of distribution of the findings themselves 
through social networks. We are experimenting with 
building audiences with Tumblr (a blogging platform) 
and Twitter, and intend to publish at least a portion of 
the results in open access journals. 

DISCUSSION 

This protocol describes an experiment to explore a 
novel mechanism to disseminate evidence-based treat- 
ment for smoking cessation using one of the largest 
online social networks, Facebook. Results from this study 
will add to the knowledge base about constructing inter- 
ventions capable of self-propagation and distribution 
and how they may influence behaviour in local net- 
works. Interventions delivered through online social net- 
works offer potential not only to enhance social support 
but also to enrich social influence. In an existing
network, an individual who quits smoking exerts an
effect on network ties, causing collateral or even
cascading smoking cessation across multiple degrees of
separation and potentially producing a cumulative impact
greater than would be predicted by efficacy rates
alone. This cascade has the potential to serve as a
profound multiplier for public health spending. We
anticipate that a new generation of research protocols
leveraging complexity science will explore not just viral
diffusion, but the interdependent impact of diffusion
and uptake on social and behavioural processes.

There are several limitations to this study that stem 
from the nature of Facebook itself. The most significant 
is the trade-off between maximising dissemination and 
the collection of personal information. Additional data
collection would have been 'invasive' relative to consumer
expectations within online social networks and
would potentially suppress the primary outcome of the
trial (ie, viral spread). We deliberately chose not to
evaluate cessation outcomes in this trial since doing so 
would have added a significant burden to participants 
and dampened our outcome of interest. Future research 
should address the question of efficacy. At the time of 
writing the initial proposal for this project, we acknowl- 
edged a risk that prior to, or during this study, the social 
network landscape might change or Facebook could 
change its internal mechanisms. We based this proposal 
(and the pilot work) on what appear to be the basic and 
common elements of the platform that seemed unlikely 
to change. Not surprisingly, a number of minor
Facebook platform changes occurred prior to the phase II
recruitment, requiring protocol changes that are
reflected in this document. We have found having an
active, in-house engineering team invaluable in keeping
up with changes in the Facebook environment.

Ultimately, if this intervention approach succeeds in
demonstrating viral spread, the project will have the
potential to substantially shift how services for tobacco
treatment, or any other health behaviour, are marketed and
delivered. Viral distribution of a behavioural intervention
through existing social networks could be applied to
multiple health conditions, including smoking, obesity,
nutrition and alcohol. Our hope is that this study
informs a near-term, future generation of effective
health interventions to be disseminated to large
populations in a low-cost, efficient manner.